Journal of Medical Genetics

BMJ

All preprints, ranked by how well they match Journal of Medical Genetics's content profile, based on 28 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.

1
Variant curation of the largest compendium of FOXL2 coding and non-coding sequence and structural variants in BPES

Matton, C.; Van De Velde, J.; De Bruyne, M.; Van De Sompele, S.; Hooghe, S.; Syryn, H.; Bauwens, M.; D'haene, E.; Dheedene, A.; Cools, M.; Komatsuzaki, S.; Preizner-Rzucidlo, E.; Ross, A.; Armstrong, C.; Watkins, W.; Shelling, A.; Vincent, A. L.; Cassiman, C.; Vermeer, S.; Bunyan, D. J.; Verdin, H.; De Baere, E.

2026-03-02 genetic and genomic medicine 10.64898/2026.02.24.25339471 medRxiv
Top 0.1%
34.4%

Heterozygous FOXL2 (non-)coding sequence and structural variants (SVs) lead to blepharophimosis, ptosis and epicanthus inversus syndrome (BPES), a rare, autosomal dominant developmental disorder characterized by a completely penetrant eyelid malformation and incompletely penetrant primary ovarian insufficiency (POI). We collected variants from our in-house database, generated via clinical genetic testing and downstream research testing in the Center for Medical Genetics Ghent, Belgium (2001-2024), and via literature and other resources in the same period. All retrieved variants were categorized using ACMG/AMP classifications to increase the knowledge of pathogenicity. We collected 413 unique genetic defects of the FOXL2 region, including 76 novel variants, in 864 index patients. Of these, 87% of patients were identified with a coding FOXL2 sequence variant. The polyalanine tract is a known mutational hotspot of FOXL2, illustrated here by the high percentage of pathogenic polyalanine expansions (24%). Furthermore, the molecular spectrum in typical BPES index patients is characterized by 8% coding deletions and 3% deletions located up- and downstream of FOXL2. The remaining 2% carry translocations along with chromosomal rearrangements of 3q23. This uniform and structured reclassification, incorporating the largest dataset of variants implicated in FOXL2-associated disease so far, will improve both the diagnosis as well as genetic counselling for individuals with BPES.

2
Refining the genetic landscape of anophthalmia and microphthalmia: a comprehensive framework with deep learning and updated gene panels

Maftei, M. I.; Spink, L. G. N.; Carmona, O. G.; Mrstakova, S. M.; Abahreh, L.; Hayes, R.; Banon, A.; Cuevas, M. E.; Cid, K.; Araya-Secchi, R.; Fraternali, F.; Yu, J.; Arno, G.; Young, R.

2025-08-28 ophthalmology 10.1101/2025.08.26.25334245 medRxiv
Top 0.1%
32.9%

Importance: Anophthalmia and microphthalmia (A/M) are rare congenital eye disorders with a low molecular diagnosis rate, which limits clinical management and genetic counselling. Improved detection and interpretation of pathogenic variants is essential for advancing diagnosis and care in affected individuals. Objective: To improve the molecular diagnostic yield in A/M patients by refining the methodology of variant investigation and association using an updated, rigorously curated gene panel and a refined bioinformatic pipeline incorporating structural variant detection, in silico artificial-intelligence-assisted predictive tools, and molecular dynamics simulations. Methodology: We curated an updated A/M gene panel through a systematic literature review and screened for rare variants in these genes using data from the UK's 100,000 Genomes Project, a national whole-genome sequencing initiative conducted by Genomics England. The cohort comprised 306 individuals recruited to the Rare Disease programme with a clinical diagnosis of anophthalmia or microphthalmia, recorded either as the primary phenotype or within HPO, SNOMED, or ICD-10 terms. Variants, including loss-of-function, missense, RNA splicing, and structural variants, were annotated with deep learning tools (AlphaMissense, SpliceAI), and missense variants were further assessed using REVEL, Missense3D, and molecular dynamics simulations. Results: We identified pathogenic or likely pathogenic variants in 37 (12.1%) individuals, with an additional 23 (7.5%) harbouring strong candidate variants of uncertain significance. Our literature review identified the biggest contributors to A/M phenotypes to be MFRP, OTX2, PRSS56 and SOX2, each with over 100 patients reported in the literature, with a total of 124 genes found to be associated with A/M.
Variants from our screen were most often found in genes with high A/M association, but also included novel findings within genes with a weaker association to A/M such as ACTG1, HDAC6, RERE and SIX3, adding support to their disease relevance. Conclusions and Relevance: This study increased the diagnostic yield in A/M patients recruited to the 100KGP, and provides further evidence of genotype-phenotype associations within the aetiology of A/M. We also provide an updated framework for enhancing clinical genetic diagnosis in A/M that may inform broader strategies for other complex congenital disorders. However, as the molecular diagnosis rate of A/M remains low, further research into the genetic aetiology of A/M is necessary.

3
Rare germline genetic variation in PAX8 transcription factor binding sites and susceptibility to epithelial ovarian cancer

Ezquina, S. A. M.; Jones, M.; Dicks, E.; de Vries, A.; Peng, P.-C.; Corona, R. I.; Lawrenson, K.; Tyrer, J. P.; Hazelett, D.; Brenton, J. D.; Antoniou, A. C.; Gayther, S. A.; Pharoah, P. D. P.

2023-03-22 genetic and genomic medicine 10.1101/2023.03.22.23287587 medRxiv
Top 0.1%
28.6%

Common genetic variation throughout the genome together with rare coding variants identified to date explain about half of the inherited genetic component of epithelial ovarian cancer risk. It is likely that rare variation in the non-coding genome will explain some of the unexplained heritability, but identifying such variants is challenging. The primary problem is lack of statistical power to identify individual risk variants by association, as power is a function of sample size, effect size and allele frequency. Power can be increased by using burden tests, which test for association of carriers of any variant in a specified genomic region. This has the effect of increasing the putative effect allele frequency. PAX8 is a transcription factor that plays a critical role in tumour progression, migration and invasion. Furthermore, regulatory elements proximal to target genes of PAX8 are enriched for common ovarian cancer risk variants. We hypothesised that rare variation in PAX8 binding sites is also associated with ovarian cancer risk, but unlikely to be associated with risk of breast, colorectal or endometrial cancer. We have used publicly available whole-genome sequencing data from the UK 100,000 Genomes Project to evaluate the burden of rare variation in PAX8 binding sites across the genome. Data were available for 522 ovarian cancers, 2560 breast cancers, 2465 colorectal cancers, 729 endometrial cancers and 2253 non-cancer controls. Active binding sites were defined using data from multiple PAX8 and H3K27 ChIPseq experiments. We found no association between the burden of rare variation in PAX8 binding sites (defined in several ways) and risk of ovarian, breast or endometrial cancer. An apparent association with colorectal cancer was likely to be a technical artefact, as a similar association was also detected for rare variation in random regions of the genome.
Despite the null result, this study provides a proof-of-principle for using burden testing to identify rare, non-coding germline genetic variation associated with disease. Larger sample sizes available from large-scale sequencing projects together with improved understanding of the function of the non-coding genome will increase the potential of similar studies in the future.

4
Cancer risks for MSH6 pathogenic variant carriers

Werf, A.-s. v. d.; Dowty, J.; Italia, M.; Bakkker, A.; Koops, F.; Bleeker, F.; Gomez-Garcia, E.; Hest, L.; Gille, H.; Cornips, C.; Jong, M. d.; Letteboer, T.; Duijkers, F.; Wagner, A.; Eikenboom, E.; van Asperen, C.; Broeke, Sanne; Win, A.; Jenkins, M.; Nielsen, M.

2025-02-18 genetic and genomic medicine 10.1101/2025.02.15.25322330 medRxiv
Top 0.1%
26.7%

Introduction: Lynch syndrome (LS) is a hereditary cancer syndrome caused by (likely) pathogenic variants (LP/P) in DNA mismatch repair genes, including MSH6. It is associated with elevated lifetime risks for colorectal cancer (CRC), endometrial cancer (EC), and other malignancies. However, cancer risks specific to MSH6-associated LS, particularly for non-colorectal cancers, remain poorly defined. This study aims to provide refined cancer risk estimates for individuals with MSH6 LP/P. Methods: We conducted a retrospective cohort study of 360 families with 1117 known MSH6 LP/P carriers identified in the Netherlands between 1995 and 2020. Pedigree data were collected from multiple clinical centers, and cancer diagnoses were confirmed through medical records. Age- and sex-specific hazard ratios (HRs) and cumulative risks (CRs) were estimated using segregation analysis, appropriately adjusted for ascertainment. Results: CRs by age 80 for MSH6 LP/P carriers were 36% in males (95% CI: 25-48%) and 21% in females (95% CI: 13-32%) for CRC, and 23% in females (95% CI: 15-43%) for EC. Elevated risks were observed for ovarian cancer (OC) (6.4%, 95% CI: 3-14.8%; HR 5.58, p=0.00037), urinary tract cancers (10.1% in males, 4.1% in females; HR 2.52, p=0.012), and biliary tract cancers (4.9% in males, 4.2% in females; HR 2.76, p=0.031). No increased risks were identified for prostate or breast cancer. Conclusion: This study refines cancer risk estimates for MSH6 LP/P carriers, suggesting the need for delayed CRC screening in males and females and proactive discussions regarding prophylactic surgery for females to address elevated risks for EC and OC.

5
Effects of pathogenic CNVs on biochemical markers: a study on the UK Biobank

Bracher-Smith, M.; Kendall, K. M.; Rees, E.; Einon, M.; O'Donovan, M.; Owen, M. J.; Kirov, G.

2019-08-06 genomics 10.1101/723270 medRxiv
Top 0.1%
23.3%

Background: Pathogenic copy number variants (CNVs) increase risk for medical disorders, even among carriers free from neurodevelopmental disorders. The UK Biobank recruited half a million adults who provided samples for biochemical and haematology tests, results of which have recently been released. We wanted to assess how the presence of pathogenic CNVs affects these biochemical test results. Methods: We called all CNVs from the Affymetrix microarrays and selected a set of 54 CNVs implicated as pathogenic (including their reciprocal deletions/duplications) and present in five or more persons. We used linear regression analysis to establish their association with 28 biochemical and 23 haematology tests. Results: We analysed 421k participants who passed our CNV quality control filters and self-reported as being of white British or Irish descent. There were 268 associations between CNVs and biomarkers that were significant at a false discovery rate of 0.05. Deletions at 16p11.2 had the highest number of significant associations, but several rare CNVs had higher effect sizes, indicating that the lack of significance was likely due to the reduced statistical power for rarer events. The distribution of values can be visualised on our interactive website: http://kirov.psycm.cf.ac.uk/. Conclusions: Carriers of many pathogenic CNVs have changes in biochemical and haematology tests, and many of those are associated with adverse health consequences. These changes did not always correlate with increases in diagnosed medical disorders in this population. Carriers should have regular blood tests in order to identify and treat adverse medical consequences early. Levels of cholesterol and related lipids were unexpectedly lower in carriers of CNVs associated with increased weight gain, most likely due to the use of statins by such people.

6
New pathogenic variants and insights into pathogenic mechanisms in GRK1-related Oguchi disease.

Poulter, J. A.; Gravett, M.; Taylor, R. L.; Fujinami, K.; De Zaeytijd, J.; Bellingham, J.; Hayashi, T.; Kondo, M.; Donnelly, D.; Toomes, C.; Ali, M.; UK Inherited Retinal Disease Consortium, ; Genomics England Research Consortium, ; De Baere, E.; Leroy, B. P.; Davies, N. P.; Webster, A. R.; Mahroo, O. A.; Arno, G.; Black, G. C.; McKibbin, M.; Harris, S. A.; Khan, K. N.; Inglehearn, C. F.

2020-02-20 genetics 10.1101/2020.02.20.936880 medRxiv
Top 0.1%
23.2%

Purpose: Biallelic mutations in G-protein coupled receptor kinase 1 (GRK1) cause Oguchi disease, a rare subtype of congenital stationary night blindness (CSNB). The purpose of this study was to identify pathogenic GRK1 variants and use in-depth bioinformatic analyses to evaluate how their impact on protein structure could lead to pathogenicity. Methods: Patients' genomic DNA was sequenced by whole genome, whole exome or focused exome sequencing. Pathogenic variants, published and novel, were compared to non-disease-associated missense variants. The impact of GRK1 missense variants at the protein level was then predicted using a series of computational tools. Results: We identified eleven previously unpublished cases with biallelic pathogenic GRK1 variants, including seven novel variants, and reviewed all GRK1 pathogenic variants. Further structure-based scoring revealed a hotspot for missense variants in the kinase domain. Additionally, to aid future clinical interpretation, we identified the bioinformatics tools best able to differentiate pathogenic from non-pathogenic variants. Conclusion: We identified new GRK1 pathogenic variants in Oguchi disease patients and investigated how disease-causing variants may impede protein function, giving new insights into the mechanisms of pathogenicity. All pathogenic GRK1 variants described to date have been collated into a Leiden Open Variation Database (http://dna2.leeds.ac.uk/GRK1_LOVD/genes/GRK1).

7
ATR16 Syndrome: Mechanisms Linking Monosomy to Phenotype

Babbs, C.; Brown, J.; Horsley, S. W.; Slater, J.; Maifoshie, E.; Kumar, S.; Ooijevaar, P.; Kriek, M.; Dixon-McIver, A.; Harteveld, C. L.; Traeger-Synodinos, J.; Higgs, D.; Buckle, V. J.

2019-10-07 genetics 10.1101/768895 medRxiv
Top 0.1%
22.9%

Background: Sporadic deletions removing 100s-1000s of kb of DNA, and variable numbers of poorly characterised genes, are often found in patients with a wide range of developmental abnormalities. In such cases, understanding the contribution of the deletion to an individual's clinical phenotype is challenging. Methods: Here, as an example of this common phenomenon, we analysed 34 patients with simple deletions of ~177 to ~2000 kb affecting one allele of the well characterised, gene dense, distal region of chromosome 16 (16p13.3), referred to as ATR-16 syndrome. We characterised precise deletion extent and screened for genetic background effects, telomere position effect and compensatory upregulation of hemizygous genes. Results: We find the risk of developmental and neurological abnormalities arises from much smaller terminal chromosome 16 deletions (~400 kb) than previously reported. Beyond this, the severity of ATR-16 syndrome increases with deletion size, but there is no evidence that critical regions determine the developmental abnormalities associated with this disorder. Surprisingly, we find no evidence of telomere position effect or compensatory upregulation of hemizygous genes; however, genetic background effects substantially modify phenotypic abnormalities. Conclusions: Using ATR-16 as a general model of disorders caused by sporadic copy number variations, we show the degree to which individuals with contiguous gene syndromes are affected is not simply related to the number of genes deleted but also depends on their genetic background. We also show there is no critical region defining the degree of phenotypic abnormalities in ATR-16 syndrome and this has important implications for genetic counselling.

8
Misclassification of a frequent loss of function variant from PMS2CL pseudogene as a PMS2 variant in Brazilian patients

Segura, A. V. C.; Silva, S. I. O. d.; Santiago, K. M.; Brianese, R. C.; Carraro, D. M.; Torrezan, G. T.

2024-03-27 genetic and genomic medicine 10.1101/2024.03.26.24304914 medRxiv
Top 0.1%
22.8%

PMS2, a Lynch Syndrome gene, presents challenges in genetic testing due to the existence of multiple pseudogenes. This study aims to describe a series of cases harboring a rare LoF variant in the PMS2CL pseudogene that has been incorrectly assigned to PMS2 with different nomenclatures. We reviewed data from 647 Brazilian patients who underwent multigene genetic testing at a single center to identify those harboring the PMS2 V1:c.2186_2187delTC or V2:c.2182_2184delACTinsG variants, allegedly located at PMS2 exon 13. Gene-specific PCR and transcript sequencing were performed. Among the 647 individuals, 1.8% (12) carried the investigated variants, with variant allele frequencies ranging from 15 to 34%. By visually inspecting the alignments, we confirmed that both V1 and V2 represented the same variant, and through gene-specific PCR and PMS2 transcript analysis, we demonstrated that V1/V2 is actually located in the PMS2CL pseudogene. Genomic databases (ExAC and gnomAD) report an incidence of 2.5%-5.3% of this variant in the African population. Currently, V1 is classified as "uncertain significance" and V2 as "conflicting" in ClinVar, with several laboratories classifying them as "pathogenic". We identified a frequent African PMS2CL LoF variant in the Brazilian population that is misclassified as a PMS2 variant. It is likely that V1/V2 have been erroneously assigned to PMS2 in several manuscripts and by clinical laboratories, underscoring a disparity-induced matter. Considering the limitations of short-read NGS in differentiating between certain regions of PMS2 and PMS2CL, using complementary methodologies is imperative to provide an accurate diagnosis.

9
Meta-Analysis of Clinical Phenotype and Patient Survival in Neurodevelopmental Disorder with Microcephaly, Arthrogryposis, and Structural Brain Anomalies Due to Bi-allelic Loss of Function Variants in SMPD4

Marchiori, D.

2022-10-10 genetic and genomic medicine 10.1101/2022.10.08.22280875 medRxiv
Top 0.1%
22.3%

A recently described, rare genetic condition known as Neurodevelopmental Disorder with Microcephaly, Arthrogryposis, and Structural Brain Anomalies (NEDMABA) has been identified in children with bi-allelic loss-of-function variants in SMPD4. The progression of this condition is not well understood, with the limited case reports described so far exhibiting a severe and clinically diverse phenotype. A gap exists in the understanding of associations present in the heterogeneous features of the clinical phenotype, and the expected survival probabilities of affected individuals. This is driven in part by the paucity of analysis-ready data on reported cases. This analysis aims to collate and standardise available case reports into a common dataset, to analyse and identify meaningful clusters in the clinical phenotype, and to quantify the survival probability for children with NEDMABA. To overcome the challenge of multidimensional data on very few subjects, we employ Multiple Correspondence Analysis (MCA) as a dimension reduction technique, the output of which is then subject to cluster analysis and interpretation. To account for censoring in the data, Kaplan-Meier estimation is used to calculate patient survival time. The analysis correctly detected the classic phenotype for this condition, as well as a new distinct feature-cluster relating to findings of vocal cord paralysis, feeding dysfunction and respiratory failure. The survival probability for those affected was found to decline sharply in early infancy with median survival of 150 days, but with some surviving as long as 12.5 years. This wide range of outcomes is provisionally associated with different variant types; however, this conclusion could not be validated because of very low sample sizes. An R package called SMPD4 was developed to publish standardised analysis-ready datasets used in this study.
This analysis represents the first of its kind to help describe associations and trajectories of individuals with this newly reported condition, despite challenges with sparse and inconsistent data. This analysis can provide clinicians and genetic counsellors with better information to aid in decision making and support for families with this rare condition.
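
As a rough illustration of the Kaplan-Meier step mentioned in this abstract, the sketch below computes a product-limit survival curve and a median survival time in plain Python. It is not the authors' code (their analysis is distributed as an R package), and the function names `kaplan_meier` and `median_survival` as well as the toy follow-up data are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Product-limit estimate: returns (time, S(time)) at each distinct event time.

    times  -- follow-up time per subject (hypothetical units, e.g. days)
    events -- True if death was observed, False if the subject was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set after time t.
        n_at_risk -= leaving
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """Smallest time at which estimated survival is 0.5 or lower, if reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy data (made up): four subjects, the second one censored at time 2.
toy_curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
```

Censored subjects shrink the risk set without lowering the curve, which is why the estimate can still be formed when some children were alive at last follow-up.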

10
Heterogeneous Genotype-Phenotype Associations in TRIO-Related Neurodevelopmental Disorder Revealed by Meta-Analysis

Duck, S. A.; George, A. L.

2025-10-21 neurology 10.1101/2025.10.17.25338222 medRxiv
Top 0.1%
20.1%

Background: The trio Rho guanine nucleotide exchange factor gene (TRIO) is highly expressed in the developing brain and contributes to neuronal development, specifically axon guidance, synaptogenesis, and cytoskeleton organization. Pathogenic TRIO variants are associated with a neurodevelopmental disorder with substantial phenotypic heterogeneity. Prior case series suggested genotype-phenotype associations related to variant type and location in the protein. However, these results need validation in a larger sample. The objectives of this research were to examine associations between phenotype, variant location and variant type among previously reported TRIO-related neurodevelopmental disorder cases, and to identify recurrent TRIO variants. Methods: Eighty-seven previously published studies annotated in the Human Gene Mutation Database reporting at least one TRIO variant were identified. Thirty-two additional cases were ascertained from the Simons Searchlight Study. A total of 699 individual case records were reviewed. After removing redundant cases, 449 unique records remained with available genotype data, of which 228 also had available phenotype information. Along with descriptive statistics, Chi-square analysis was used to test associations between variant and head size. Results: In a meta-analysis of reported TRIO variants, categorically defined head size is associated with variant type (missense vs truncating) and protein domain location (χ² = 39.20; p < 0.001). Specifically, missense variants in the spectrin repeat domain are associated with macrocephaly, whereas missense variants outside the spectrin domain and truncating variants are associated with microcephaly. The most prevalent phenotypic features were intellectual disability/developmental delay followed by autism spectrum disorder (ASD) or ASD-like behaviors. Seven recurrent TRIO variants were identified, with head size consistent across cases with the same variant.
Conclusions: TRIO variant type and location exhibit unique phenotypic associations. This observation may help clinicians and families to anticipate neurodevelopmental outcomes. Furthermore, identified recurrent variants may serve as targets for future translational and pharmacological research.

11
Investigating a possible role of R3HCC1L in embryonic development and ocular disease

Fischer, M. C.; Reis, L. M.; Muheisen, S.; Seese, S. E.; Semina, E. V.

2024-11-03 genetics 10.1101/2024.10.29.620958 medRxiv
Top 0.1%
19.2%

Peters anomaly (PA) is an anterior segment ocular disorder with wide phenotypic variability and genetic heterogeneity. Here we report a family consisting of a male with a diagnosis of syndromic PA and his unaffected parents, with no causative variants identified in known developmental ocular genes. Exome sequencing analysis identified compound heterozygous missense variants, c.1022A>T p.(Asp341Val) and c.1457T>A p.(Phe486Tyr), in R3HCC1L. Both variants are ultra-rare in control populations and have CADD scores of 21.9 and 23.3, respectively, suggesting possible deleterious effects. The R3HCC1L transcript variants encode three different protein isoforms all sharing two conserved C-terminal domains, an RNA recognition motif (RRM) and a coiled-coil domain (CCD), and likely represent an RNA-binding protein involved in post-transcriptional gene regulation; the identified patient variants are located within the N-terminal part of the protein shared by two of the three protein isoforms, upstream of the RRM and CCD domains. To investigate the possible role of R3HCC1L in embryonic development, the single zebrafish ortholog of R3HCC1L, r3hcc1l, was examined for its expression and function. In situ expression studies showed that zebrafish r3hcc1l is expressed in the developing lens, cornea, retina and hyaloid vasculature, supporting its possible role in ocular development in vertebrates. CRISPR-Cas9 gene editing was used to generate a zebrafish line with a 4-bp deletion in r3hcc1l, c.623-626del, that is predicted to result in nonsense-mediated decay and, if expressed, a nonfunctional truncated protein (p.Thr208fs*39) lacking 80% of the RRM and the entire CCD. The resultant r3hcc1l c.623-626del heterozygous and homozygous animals did not show any visible structural abnormalities in the eye or any other systems, with normal survival of all genotypes to adulthood, providing no support for its possible role in the congenital phenotype of interest.
However, the function of r3hcc1l may not be completely conserved with human R3HCC1L, and/or zebrafish may have compensatory mechanisms that are not present in humans. In addition, the engineered zebrafish variant disrupts the most conserved C-terminal region of R3HCC1L/r3hcc1l shared by all protein isoforms and likely leads to a complete loss-of-function of this gene, which may be different from the disease mechanism associated with the specific missense alleles identified in the patient. Finally, while it is important to consider the possible limitations of animal models, it is also necessary to highlight that the identified R3HCC1L variants may not have any role in the phenotype observed in this single patient. Identification of new R3HCC1L variants of interest in families affected with PA or other congenital phenotypes, if successful, will provide further support for the possible developmental function of this gene.

12
Mode of inheritance needs to be accounted for in interpreting genotype-phenotype links in monogenic disorders

Riera-Escamilla, A.; Welt, C. K.; Laan, M.

2023-09-27 genetic and genomic medicine 10.1101/2023.09.26.23296051 medRxiv
Top 0.1%
19.1%

Introduction: A recently published study by Ke et al. utilized whole exome sequencing (WES) to screen genetic variants contributing to premature ovarian insufficiency (POI) in a large cohort of 1,030 patients from China (doi: 10.1038/s41591-022-02194-3). The authors reported that 285 likely pathogenic (LP) and pathogenic (P) variants identified in 79 genes contributed to POI in 242 study subjects, representing 23.5% of the cohort. The majority, 191 patients (~79%), carried monoallelic (heterozygous) variants. Objective: We re-analyzed the contribution of reported genotypes considering the inheritance mode of POI and other inherited conditions linked to the 79 genes with findings reported by Ke et al. Methods: The disease inheritance modes linked to targeted genes were retrieved from publicly available databases (OMIM, Genomics England PanelApp, PubMed, DOMINO, gnomAD). Genotypes of the 242 cases reported by Ke et al. were assessed in the context of known inheritance mode(s) of disorders linked to the respective genes. Results: Most genes (48 of 79) were classified as recessive, whereas only 13 genes were dominant. Insufficient data were available for 18 genes to conclusively determine their inheritance mode. Nearly half of the 242 cases reported by Ke et al., 119 women (~49%), carried heterozygous variants in known autosomal recessive genes and therefore these variants are not contributing to their POI phenotype. Only 68 women (6.6%) carried biallelic variants in either recessive or dominant genes or monoallelic variants in dominant genes, hence contributing to the diagnostic yield. This is ~3.5-fold lower than the 23.5% claimed by Ke et al. An additional 56 women (5.4%) were reported to carry monoallelic variants in genes with insufficient data to determine the inheritance mode or multiple heterozygous variants in >1 recessive gene, whereby oligogenic contribution to POI cannot be excluded.
Even when these cases are included, the maximum estimated contributing yield is ~12%, two times lower than claimed. Conclusion: Using WES to screen monogenic causes of POI as part of the diagnostic pipeline will improve patient management strategies, but overestimated diagnostic yield in genetic research can create unrealistic expectations in the POI clinical community, typically non-specialist in genetics.
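
The percentages quoted in this abstract can be re-derived from the counts it gives. The short sketch below does only that arithmetic; the variable names and the helper `pct` are illustrative assumptions, and all input numbers are taken directly from the abstract text.

```python
# Counts as stated in the abstract.
cohort = 1030             # patients screened by Ke et al.
reported_cases = 242      # cases originally reported as explained
monoallelic_cases = 191   # of those, carrying heterozygous variants
het_in_recessive = 119    # heterozygous variants in recessive genes
confirmed_cases = 68      # genotype consistent with inheritance mode
uncertain_cases = 56      # inheritance mode unclear or possible oligogenic


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place (hypothetical helper)."""
    return round(100 * numerator / denominator, 1)


claimed_yield = pct(reported_cases, cohort)
revised_yield = pct(confirmed_cases, cohort)
maximum_yield = pct(confirmed_cases + uncertain_cases, cohort)
fold_difference = round(reported_cases / confirmed_cases, 2)

print(f"claimed yield:  {claimed_yield}%")
print(f"revised yield:  {revised_yield}%")
print(f"maximum yield:  {maximum_yield}%")
print(f"fold difference (claimed / revised): {fold_difference}")
```

The counts reproduce the stated 23.5%, 6.6% and roughly 12% figures, and the ratio of claimed to revised cases comes out at about 3.5 to 3.6, in line with the "~3.5-fold" in the text.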

13
Benchmarking Polygenic Risk Score Model Assumptions: towards more accurate risk assessment

Kulm, S.; Mezey, J.; Elemento, O.

2022-02-18 genomics 10.1101/2022.02.18.480983 medRxiv
Top 0.1%
18.5%

Polygenic risk scores represent an individual's genetic susceptibility to a phenotype. As with any model, the statistical models commonly employed to fit polygenic risk scores and assess their accuracy contain several assumptions. The effects of these assumptions on models of polygenic risk score have not been thoroughly assessed. We assessed 26 variations of the traditional polygenic risk score model, each of which mitigates assumptions in one of five facets of disease modelling: representation of age (6 variations), censorship (3 variations), competing risks (7 variations), formation of disease labels (6 variations), and selection of covariates (4 variations). With data from the UK Biobank, each model variation included age, sex, and a polygenic risk score derived from the PGS Catalog. Each of the 26 model variations was fitted to predict 18 diseases. Compared to the plain model that contained all five facets of assumptions, the model variations often fit the data better and generated predictions that largely differed from the predictions of the plain model. The statistic Royston's R2 measured a model's goodness of fit, and thereby determined if the model was an enhancement upon the plain model. For 15 of the 26 model variations, Royston's R2 was greater than that of the plain model for >50% of diseases. The reclassification rate, defined as the fraction of individuals in the top five percentiles of the plain model's predictions who are not in the top five percentiles of a model variation's predictions, was used to determine if the variation led to significantly different predictions. For 20 of the 26 model variations, the median reclassification rate calculated across the 18 diseases was greater than 10%. Comparisons of accuracy statistics further illustrated how much each model variation's predictions differed from the plain model's predictions. Models containing polygenic risk scores appear to be significantly affected by many common modelling assumptions.
Therefore, future investigations should consider taking some action to mitigate modelling assumptions. Author Summary: An individual's genetics can increase their risk of experiencing a disease. The exact magnitude of the increased risk is estimated within a statistical model. The traditional model type employed in this process is relatively plain and contains several assumptions. The predicted risk estimates from this plain model may be unnecessarily inaccurate. To test this possibility, we searched the literature for model variations that reduce the assumptions of the plain model, ultimately creating 26 distinct model variations that may improve upon the plain model. Each model variation was fit with data from the UK Biobank to predict 18 diseases. We found that 15 of the 26 model variations fit the data better than the plain model for a majority of diseases. Goodness of fit was measured with Royston's R2 statistic. Further calculations found that the predictions of the model variations were often significantly more or less accurate than the predictions of the plain model. We believe these results indicate that future investigations of polygenic risk scores should not employ the plain model, as unreliable risk predictions will likely result.
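
The reclassification rate defined in this abstract can be sketched in a few lines of Python. This is a minimal reading of the stated definition rather than the authors' implementation; the function names, the tie-breaking by sort order, and the toy scores are assumptions for illustration.

```python
from typing import Dict, Set


def top_percent_ids(scores: Dict[int, float], top_percent: int = 5) -> Set[int]:
    """IDs of individuals whose predicted risk is in the top `top_percent` percent.

    Ties are broken by sort order, which is an arbitrary choice for this sketch.
    """
    k = max(1, (len(scores) * top_percent) // 100)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def reclassification_rate(plain: Dict[int, float],
                          variation: Dict[int, float],
                          top_percent: int = 5) -> float:
    """Fraction of the plain model's top group that is absent from the variation's top group."""
    top_plain = top_percent_ids(plain, top_percent)
    top_variation = top_percent_ids(variation, top_percent)
    return len(top_plain - top_variation) / len(top_plain)


# Toy predictions (made up): 100 individuals, risk equal to their ID under the plain model.
plain_scores = {i: float(i) for i in range(100)}
# The variation demotes individual 99 and promotes individual 10.
variation_scores = dict(plain_scores)
variation_scores[99] = -1.0
variation_scores[10] = 1000.0

rate = reclassification_rate(plain_scores, variation_scores)
```

In this toy case one of the five individuals in the plain model's top group drops out under the variation, so the rate is 0.2.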

14
The genomic landscape of syndromic and non-syndromic hearing loss within the 100,000 Genomes Project cohort

Vestito, L.; Smedley, D.; Cipriani, V.; Moore, G. E.; Stanier, P.; Bowl, M. R.; Dawson, S. J.; Clement, E.; Bitner-Glindzicz, M.

2025-02-07 genetic and genomic medicine 10.1101/2025.02.06.25321804 medRxiv
Top 0.1%
18.3%

Objective: This study aims to describe the genetic landscape of syndromic and non-syndromic hearing loss (HL) in the UK population using data from the 100,000 Genomes Project (100kGP). Design: Cohort study. Setting: NHS England. Participants: 2,271 families with syndromic and non-syndromic HL recruited to the 100kGP rare disease programme between 2013 and 2018. Participants with at least one Human Phenotype Ontology (HPO) term descendant of the term "Hearing impairment" (HP:0000365) were included; this equated to 5,488 individuals, comprising 2,762 affected individuals and 2,726 unaffected relatives. Main outcome measure: Diagnostic rate and prevalence of different gene diagnoses by auditory phenotype identified by whole genome sequencing. Results: The overall diagnostic yield was conservatively estimated at 27.5% (625/2271), with diagnoses identified in 273 different genes. Common causative genes included USH2A, GJB2, COL1A1 and MYO15A, accounting for approximately 20% of the diagnoses. This diagnostic rate excludes variants of uncertain significance (VUS), variants in genes where HL cannot be confidently attributed to the identified variant, or those still awaiting confirmation. The inclusion of these categories would increase the diagnostic yield to 39.6%. This work describes the 100kGP standard pipeline and supplementary analyses that include the use of Exomiser. Stratification of the cohort allowed quantification of the likelihood of genetic diagnosis with specific phenotypic combinations and identification of positive predictors for a genetic diagnosis by auditory phenotype. A statistically significant increase in diagnostic rate was reported for those with congenital (33.2%), bilateral (27%), and high-frequency (32.4%) hearing subtypes. Furthermore, in patients with HPO terms restricted to the auditory system alone, around 40% of diagnoses were attributed to genes that might have a broader syndromic phenotype (non-syndromic mimics). A high diagnostic yield (56%) was seen in patients with ear and eye abnormalities, largely driven by genes associated with Usher and Wolfram syndrome. Conclusion: This study offers valuable insights into the complex genomic and phenotypic architecture of both syndromic and non-syndromic HL, which has the potential to improve diagnostic pipelines and inform clinical care.

15
Validation of the ACMG/AMP guidelines-based seven-category variant classification system

Chen, J.-M.; Masson, E.; Zou, W.-B.; Liao, Z.; Genin, E.; Cooper, D. N.; Ferec, C.

2023-01-25 genetic and genomic medicine 10.1101/2023.01.23.23284909 medRxiv
Top 0.1%
18.3%

Background: One shortcoming of employing the American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP)-recommended five-category variant classification scheme ("pathogenic", "likely pathogenic", "uncertain significance", "likely benign" and "benign") in medical genetics lies in the scheme's inherent inability to deal properly with variants that fall midway between "pathogenic" and "benign". Employing chronic pancreatitis as a disease model, and focusing on the four most studied chronic pancreatitis-related genes, we recently expanded the five-category ACMG/AMP scheme into a seven-category variant classification system. With the addition of two new classificatory categories, "predisposing" and "likely predisposing", our seven-category system promises to provide improved classification for the entire spectrum of variants in any disease-causing gene. The applicability and practical utility of our seven-category variant classification system, however, remain to be demonstrated in other disease/gene contexts, and this has been the aim of the current analysis. Results: We have sought to demonstrate the potential universality of pathological variants that could be ascribed the new variant terminology ("predisposing") by trialing it across three Mendelian disease contexts (i.e., autosomal dominant, autosomal recessive and X-linked). To this end, we first employed illustrative genes/variants characteristic of these three contexts. On the basis of our own knowledge and expertise, we identified a series of variants that fitted well with our "predisposing" category, including "hypomorphic" variants in the PKD1 gene and "variants of varying clinical consequence" in the CFTR gene. These examples, followed by reasonable extrapolations, enabled us to infer the widespread occurrence of "predisposing" variants in disease-causing genes. Such "predisposing" variants are likely to contribute significantly to the complexity of human genetic disease and may account not only for a considerable proportion of the unexplained cases of monogenic and oligogenic disease but also for much of the "missing heritability" characteristic of complex disease. Conclusion: Employing an evidence-based approach together with reasonable extrapolations, we demonstrate both the applicability and utility of our seven-category variant classification system for disease-causing genes. The recognition of the new "predisposing" category not only has immediate implications for variant detection and interpretation but should also have important consequences for reproductive genetic counseling.

16
De novo variants of NALCN differentially impact both the phenotypic spectrum of patients and the biophysical properties of the NALCN current

HADOUIRI, N.; GARCIA, L. P.; BAUDAT, R.; PARRA-DIAZ, P.; GIL-NAGEL REIN, A.; DEL PINO, I.; WHALEN, S.; GROTTO, S.; BRUNET, T.; BRUGGER, M.; MARAFI, D.; VILL, K.; LEDERER, D.; KARADURMUS, D.; DESIR, J.; NASSOGNE, M. C.; FAVIER, M.; SRIVASTAVA, S.; BRISCHOUX BOUCHER, E.; LEVY, J.; YOUNG, D.; HORVATH, G.; MAREY, I.; DIETERICH, K.; FIORILLO, C.; WEIGAND, H.; HANNANE, N.; SHILLINGTON, A.; STANGE, L.; DAGLI, A.; ARGILLI, E.; LE, C.; SHERR, E. H.; LEE, B. H.; GATES, R. W.; MAYSTADT, I.; DEPREZ, M.; LESCA, G.; RODE, G.; RUAULT, V.; SOLIANI, L.; LANZARINI, E.; EATON, A. J.; MORNEAU-JACOB, F. D.; PRINZI

2025-06-22 neurology 10.1101/2025.06.20.25329825 medRxiv
Top 0.1%
18.3%

The Na+ leak channel NALCN regulates the resting membrane potential and consequently cell excitability of several cell types, including neurons. Studies of animal models demonstrated that NALCN is involved in fundamental physiological functions such as respiratory rhythm, circadian rhythm, sleep, locomotor behavior and pain perception. Pathogenic variants of NALCN have been associated with ultra-rare developmental disorders characterized by a wide range of symptoms with variable severity. We and others previously showed that pathogenic variants of NALCN can be categorized into two groups. The first group corresponds to inherited biallelic loss-of-function variants, with patients suffering from the IHPRF1 syndrome (OMIM #615419). The second corresponds to de novo gain-of-function variants that cause the CLIFAHDD syndrome (OMIM #616266). In this study, we provide a standardized phenotypic description of a large group of 35 individuals with de novo pathogenic variants of NALCN. In addition, we performed functional studies of several of these variants using the patch clamp technique in a recombinant system. We highlight a large heterogeneity in terms of both expressed symptoms and their severity. In contrast with previous reports showing only a pure gain-of-function effect of de novo pathogenic variants, we found that de novo variants of NALCN differentially impact the biophysical properties of the NALCN current and likely influence cell excitability. To conclude, de novo variants of NALCN differentially impact the biophysical properties of the NALCN current. We hypothesize that this may at least partly explain the phenotypic diversity observed in patients.

17
Next generation sequencing identifies a pattern of novel germline variants in early-onset colorectal cancer

VANDE PERRE, P.; AL SAATI, A.; CABARROU, B.; PLENECASSAGNES, J.; GILHODES, J.; MONSELET, N.; LIGNON, N.; FILLERON, T.; VILLARZEL, C.; GOURDAIN, L.; SELVES, J.; MARTINEZ, M.; CHIPOULET, E.; COLLET, G.; MALLET, L.; BONNET, D.; GUIMBAUD, R.; Toulas, C.

2024-12-12 genetics 10.1101/2024.12.09.627474 medRxiv
Top 0.1%
18.2%

Early-onset colorectal cancer (EOCRC) incidence is increasing rapidly worldwide. However, the majority of EOCRCs are not substantiated by germline variants in the main colorectal cancer (CRC) predisposition genes (the "DIGE" panel). To investigate a potential genetic transmission of EOCRC (dominant, recessive and oligogenic hypotheses) and thus identify potentially novel EOCRC-specific predisposition genes, we conducted an analysis of 585 cancer pathway genes on an EOCRC patient cohort (n=87 patients diagnosed at ≤40 years of age, DIGE-) with or without a CRC family history. By comparing this germline variant spectrum to the gnomAD cancer-free database, we identified high impact variants (HVs) in 15 genes significantly over-represented in the EOCRC cohort. Among the 32 unrelated patients with a CRC family history (i.e. with a potentially dominant transmission pattern), nine presented HVs in ten of the genes tested; four of these genes had a DNA repair function. Our results support neither a potentially recessive transmission of EOCRC in patients without a CRC family history nor an oligogenic transmission. We subsequently sequenced these 15 genes in a cohort of 82 late-onset CRCs (cancer diagnosis ≥50 years, DIGE-) and found variants in 11 of these genes to be specific to EOCRC. To evaluate whether variants in these 11 genes would allow the specific detection of EOCRC patients, we screened our patient database (n=6482), which only contained 2% of EOCRCs (DIGE-), and identified two other EOCRC cases diagnosed after the constitution of our cohort, with individual HVs in RECQL4 and NUTM1. Altogether, we showed that 37.5% and 18.75% of the carriers of heterozygous NUTM1 and RECQL4 HVs in our database, respectively, were diagnosed with EOCRC. Our work has identified a pattern of germline gene variants not previously associated with EOCRC. This paves the way to addressing the contribution of these variants to EOCRC risk and oncogenesis.
Author Summary: Early-onset colorectal cancer (diagnosed at ≤40 years of age) is a rare disease that can in part be explained by a hereditary genetic predisposition. To identify novel gene variants potentially associated with EOCRC risk, we analysed a panel of 585 genes in 87 patients with early-onset colorectal cancer unexplained by conventional genetic tests. This first analysis highlighted 15 genes of interest. To evaluate if this genetic profile is specific to early onset, we sequenced these 15 genes in a population of late-onset colorectal cancers (diagnosed after 50 years of age). Variants in 11 of these genes were specific to the early-onset population. To assess if this genetic pattern allows the identification of other early-onset cases, we screened these genes in our whole database of 6482 patients and identified two new early-onset cases. Our results need to be confirmed and validated in larger cohorts, but they pave the way for future research into early-onset colorectal cancer and the possibility of improving screening or treatment options for these patients and their family members.

18
Phenotype correlations with pathogenic DNA variants in the MUTYH gene

Thet, M.; Plazzer, J. P.; Capella, G.; Latchford, A.; Nadeau, E. A. W.; Greenblatt, M. S.; Macrae, F.

2024-05-15 genetic and genomic medicine 10.1101/2024.05.15.24307143 medRxiv
Top 0.1%
17.7%

MUTYH-associated polyposis (MAP) is an autosomal recessive disorder where the inheritance of constitutional biallelic pathogenic MUTYH variants predisposes a person to the development of adenomas and colorectal cancer (CRC). It is also associated with extracolonic and extraintestinal manifestations that may overlap with the phenotype of familial adenomatous polyposis (FAP). Currently, there are discrepancies in the literature regarding whether certain phenotypes are truly associated with MAP. This narrative review aims to explore the phenotypic spectrum of MAP to better characterise the MAP phenotype. A literature search was conducted to identify articles reporting on MAP-specific phenotypes. Clinical data from 2109 MAP patients identified from the literature showed that 1123 patients (53.2%) had CRC. Some patients with CRC had no associated adenomas, suggesting that adenomas are not an obligatory component of MAP. Carriers of the two missense founder variants, and possibly truncating variants, had an increased cancer risk when compared to those who carry other pathogenic variants. It has been suggested that somatic G:C>T:A transversions are a mutational signature of MAP, and could be used as a biomarker in screening and identifying patients with atypical MAP, or in associating certain phenotypes with MAP. The extracolonic and extraintestinal manifestations that have been associated with MAP include duodenal adenomas, duodenal cancer, fundic gland polyps, gastric cancer, ovarian cancer, bladder cancer and skin cancer. The association of breast cancer and endometrial cancer with MAP remains disputed. Desmoids and Congenital Hypertrophy of the Retinal Pigment Epithelium (CHRPEs) are rarely reported in MAP, but have long been seen in FAP patients, and thus could act as a distinguishing feature between the two. This collection of MAP phenotypes will assist in the assessment of pathogenic MUTYH variants using the American College of Medical Genetics and the Association for Molecular Pathology (ACMG/AMP) Variant Interpretation Guidelines, and ultimately improve patient care.

19
Loss-of-function variants in JPH1 cause congenital myopathy with prominent facial involvement

Johari, M.; Topf, A.; Folland, C.; Duff, J.; Dofash, L.; Marti, P.; Robertson, T.; Vilchez, J.; Cairns, A.; Harris, E.; Marini-Bettolo, C.; Ravenscroft, G.; Straub, V.

2024-02-11 neurology 10.1101/2024.02.10.24302480 medRxiv
Top 0.1%
17.6%

Background: Weakness of facial, ocular, and axial muscles is a common clinical presentation in congenital myopathies caused by pathogenic variants in genes encoding triad proteins. Abnormalities in triad structure and function resulting in disturbed excitation-contraction coupling and Ca2+ homeostasis can contribute to disease pathology. Methods: We analysed exome and genome sequencing data from three unrelated individuals with congenital myopathy characterised by striking facial, ocular, and bulbar involvement. We collected deep phenotypic data from the affected individuals. We analysed the RNA-seq data of one proband and performed gene expression outlier analysis in 129 samples. Results: The three probands had remarkably similar clinical presentation with prominent facial, ocular, and bulbar features. Disease onset was in the neonatal period with hypotonia, poor feeding, cleft palate and talipes. Muscle weakness was generalised but most prominent in the lower limbs with facial weakness also present. All patients had myopathic facies, bilateral ptosis, ophthalmoplegia and fatiguability. While muscle biopsy on light microscopy did not show any obvious morphological abnormalities, ultrastructural analysis showed slightly reduced triads and structurally abnormal sarcoplasmic reticulum. DNA sequencing identified three unique homozygous loss-of-function variants in JPH1, encoding junctophilin-1, in the three families: a stop-gain (c.354C>A; p.Tyr118*) and two frameshift (c.373del; p.Asp125Thrfs*30 and c.1738del; p.Leu580Trpfs*16) variants. Muscle RNA-seq showed strong downregulation of JPH1 in the F3 proband. Conclusions: Junctophilin-1 is critical to the formation of skeletal muscle triad junctions by connecting the sarcoplasmic reticulum and T-tubules. Our findings suggest that loss of JPH1 results in a congenital myopathy with prominent facial, bulbar and ocular involvement.
Key message: This study identified novel homozygous loss-of-function variants in the JPH1 gene, linking them to a unique form of congenital myopathy characterised by severe facial and ocular symptoms. Our research sheds light on the critical role of junctophilin-1 in skeletal muscle triad junction formation and the consequences of its disruption resulting in a myopathic phenotype. What is already known on this topic: Previous studies have shown that pathogenic variants in genes encoding triad proteins lead to various myopathic phenotypes, with clinical presentations often involving muscle weakness and myopathic facies. The triad structure is essential for excitation-contraction (EC) coupling and calcium homeostasis and is a key element in muscle physiology. What this study adds and how this study might affect research, practice or policy: This study establishes that homozygous loss-of-function mutations in JPH1 cause a congenital myopathy predominantly affecting facial and ocular muscles. This study also provides clinical insights that may aid clinicians in diagnosing similar genetically unresolved cases.

20
Confirmation of the MIR204 n.37C>T heterozygous variant as a cause of chorioretinal dystrophy variably associated with iris coloboma, early-onset cataracts and congenital glaucoma

Jedlickova, J.; Vajter, M.; Barta, T.; Black, G. C.; Mares, J.; Fichtl, M.; Kousal, B.; Dudakova, L.; Liskova, P.

2023-02-11 ophthalmology 10.1101/2023.02.09.23284763 medRxiv
Top 0.1%
17.3%

Four members of a three-generation family with early-onset chorioretinal dystrophy were shown to be heterozygous carriers of the n.37C>T variant in MIR204. The identification of this previously reported pathogenic variant confirms the existence of a distinct clinical entity caused by a sequence change in MIR204. The chorioretinal dystrophy was variably associated with iris coloboma, congenital glaucoma, and premature cataracts, extending the phenotypic range of the condition. In silico analysis of the n.37C>T variant revealed 713 novel targets. Additionally, family members were shown to be affected by albinism resulting from biallelic pathogenic OCA2 variants.